Smart Paper Notes

ReAssigner: A Plug-and-Play Virtual Machine Scheduling Intensifier for Heterogeneous Requests

Haochuan Cui , Junjie Sheng , Bo Jin , Yiqiu Hu , Li Su , Lei Zhu , Wenli Zhou , Xiangfeng Wang

Category: Artificial Intelligence

2022-11-29

With the rapid development of cloud computing, virtual machine scheduling has become one of the most important and challenging problems for the cloud computing community, especially for practical heterogeneous request sequences. An analysis of how request heterogeneity affects several popular heuristic schedulers shows that existing scheduling algorithms cannot handle it properly or efficiently. This paper proposes a plug-and-play virtual machine scheduling intensifier, called Resource Assigner (ReAssigner), to enhance the scheduling efficiency of any given scheduler for heterogeneous requests. The key idea of ReAssigner is to pre-assign roles to physical resources and let resources of the same role form a virtual cluster that handles homogeneous requests. ReAssigner can cooperate with arbitrary schedulers by restricting their scheduling space to virtual clusters. In evaluations on a real dataset from Huawei Cloud, ReAssigner achieves significant scheduling performance improvements over several state-of-the-art scheduling methods.
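
The idea of restricting an existing scheduler to a virtual cluster can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Host` class, the `first_fit` base scheduler, the role labels, and the `role_of` rule that maps a request to a role by its CPU size are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Host:
    name: str
    role: str      # pre-assigned role (hypothetical labels "small" / "large")
    free_cpu: int


def first_fit(hosts: List[Host], cpu: int) -> Optional[Host]:
    """Stand-in base scheduler: place the request on the first host that fits."""
    for host in hosts:
        if host.free_cpu >= cpu:
            host.free_cpu -= cpu
            return host
    return None


def role_of(cpu: int) -> str:
    # Hypothetical rule: requests of up to 4 CPUs are treated as "small".
    return "small" if cpu <= 4 else "large"


def restricted_schedule(
    hosts: List[Host],
    cpu: int,
    base_scheduler: Callable[[List[Host], int], Optional[Host]] = first_fit,
) -> Optional[Host]:
    """Run the unchanged base scheduler on the virtual cluster for the request's role."""
    cluster = [h for h in hosts if h.role == role_of(cpu)]
    return base_scheduler(cluster, cpu)


hosts = [Host("h1", "small", 8), Host("h2", "large", 32), Host("h3", "small", 8)]
placements = [restricted_schedule(hosts, cpu) for cpu in (2, 16, 4, 4)]
placement_names = [p.name if p else None for p in placements]
print(placement_names)
```

Because the base scheduler is passed in as a function and only sees a filtered host list, any other scheduler with the same signature could be swapped in without modification, which is the plug-and-play property the abstract describes.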


Enhanced artificial intelligence-based diagnosis using CBCT with internal denoising: Clinical validation for discrimination of fungal ball, sinusitis, and normal cases in the maxillary sinus

Kyungsu Kim , Chae Yeon Lim , Joong Bo Shin , Myung Jin Chung , Yong Gi Jung

Category: Computer Vision

2022-11-29

Cone-beam computed tomography (CBCT) provides 3D volumetric imaging of a target at a lower radiation dose and cost than conventional computed tomography, and it is widely used in the detection of paranasal sinus disease. However, it lacks the sensitivity to detect soft tissue lesions owing to reconstruction constraints. Consequently, only physicians with expertise in CBCT reading can distinguish inherent artifacts or noise from disease, which restricts the use of this imaging modality. The development of artificial intelligence (AI)-based computer-aided diagnosis methods for CBCT to overcome the shortage of experienced physicians has attracted substantial attention. However, advanced AI-based diagnosis that addresses the intrinsic noise in CBCT has not been devised, discouraging the practical use of AI solutions for CBCT. To address this issue, we propose an AI-based computer-aided diagnosis method using CBCT with a denoising module. This module is applied before diagnosis to reconstruct the internal ground-truth full-dose scan corresponding to an input CBCT image and thereby improve the diagnostic performance. The external validation results for the unified diagnosis of sinus fungal ball, chronic rhinosinusitis, and normal cases show that the proposed method improves the micro-average AUC, macro-average AUC, and accuracy by 7.4, 5.6, and 9.6 percentage points (from 86.2, 87.0, and 73.4 to 93.6, 92.6, and 83.0%), respectively, compared with a baseline, while improving human diagnosis accuracy by 11 percentage points (from 71.7 to 83.0%), demonstrating technical differentiation and clinical effectiveness. This pioneering study on AI-based diagnosis using CBCT indicates that denoising can improve diagnostic performance and reader interpretability in images of the sinonasal area, thereby providing a new approach and direction for radiographic image reconstruction in the development of AI-based diagnostic solutions.


Tackling Visual Control via Multi-View Exploration Maximization

Mingqi Yuan , Xin Jin , Bo Li , Wenjun Zeng

Category: Machine Learning | Artificial Intelligence | Computer Vision

2022-11-28

We present MEM: Multi-view Exploration Maximization for tackling complex visual control tasks. To the best of our knowledge, MEM is the first approach that combines multi-view representation learning and intrinsic reward-driven exploration in reinforcement learning (RL). More specifically, MEM first extracts the specific and shared information of multi-view observations to form high-quality features, then performs RL on the learned features, enabling the agent to fully comprehend the environment and yield better actions. Furthermore, MEM transforms the multi-view features into intrinsic rewards based on entropy maximization to encourage exploration. As a result, MEM can significantly improve the sample efficiency and generalization ability of the RL agent, which helps in solving real-world problems with high-dimensional observations and sparse-reward spaces. We evaluate MEM on various tasks from the DeepMind Control Suite and Procgen games. Extensive simulation results demonstrate that MEM achieves superior performance and outperforms the benchmarking schemes with a simple architecture and higher efficiency.


OSIC: A New One-Stage Image Captioner Coined

Bo Wang , Zhao Zhang , Mingbo Zhao , Xiaojie Jin , Mingliang Xu , Meng Wang

Category: Computer Vision

2022-11-04

Mainstream image captioning models are usually two-stage captioners, i.e., they compute object features with a pre-trained detector and feed them into a language model to generate text descriptions. However, such an operation causes a task-based information gap that decreases performance, since the object features from the detection task are a suboptimal representation and cannot provide all the information needed for subsequent text generation. Besides, object features are usually represented by last-layer features, which lose the local details of input images. In this paper, we propose a novel One-Stage Image Captioner (OSIC) with dynamic multi-sight learning, which directly transforms an input image into descriptive sentences in one stage. As a result, the task-based information gap can be greatly reduced. To obtain rich features, we use the Swin Transformer to compute multi-level features, and then feed them into a novel dynamic multi-sight embedding module to exploit both the global structure and the local texture of input images. To enhance the global modeling of the encoder for captioning, we propose a new dual-dimensional refining module to non-locally model the interaction of the embedded features. In this way, OSIC can obtain rich and useful information to improve the image captioning task. Extensive comparisons on the benchmark MS-COCO dataset verify the superior performance of our method.


Rewarding Episodic Visitation Discrepancy for Exploration in Reinforcement Learning

Mingqi Yuan , Bo Li , Xin Jin , Wenjun Zeng

Category: Machine Learning | Artificial Intelligence

2022-09-19

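Exploration is critical for deep reinforcement learning in complex environments with high-dimensional observations and sparse rewards. To address this problem, recent approaches leverage intrinsic rewards to improve exploration, such as novelty-based exploration and prediction-based exploration. However, many intrinsic reward modules require sophisticated structures and representation learning, resulting in prohibitive computational complexity and unstable performance. In this paper, we propose Rewarding Episodic Visitation Discrepancy (REVD), a computationally efficient and quantified exploration method. More specifically, REVD provides intrinsic rewards by evaluating the Rényi divergence-based visitation discrepancy between episodes. To estimate the divergence efficiently, a k-nearest neighbor estimator is used with a randomly initialized state encoder. Finally, REVD is tested on PyBullet robotics environments and Atari games. Extensive experiments demonstrate that REVD can significantly improve the sample efficiency of reinforcement learning algorithms and outperform the benchmarking methods.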
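
The combination of a fixed random encoder and a k-nearest neighbor distance can be sketched as below. This is a deliberately simplified stand-in, not the REVD estimator: instead of a Rényi divergence, each state in the current episode is rewarded by its k-NN distance to the encoded states of the previous episode. The function names, the linear encoder, and the toy episodes are assumptions made for illustration.

```python
import math
import random
from typing import Callable, List, Sequence

Vector = List[float]


def make_random_encoder(in_dim: int, out_dim: int, seed: int = 0) -> Callable[[Sequence[float]], Vector]:
    """Build a fixed, untrained linear encoder with random Gaussian weights."""
    rng = random.Random(seed)
    weights = [[rng.gauss(0.0, 1.0) for _ in range(in_dim)] for _ in range(out_dim)]

    def encode(state: Sequence[float]) -> Vector:
        return [sum(w * x for w, x in zip(row, state)) for row in weights]

    return encode


def knn_distance(query: Vector, points: List[Vector], k: int) -> float:
    """Euclidean distance from `query` to its k-th nearest neighbor in `points` (non-empty)."""
    distances = sorted(math.dist(query, p) for p in points)
    return distances[min(k, len(distances)) - 1]


def intrinsic_rewards(current, previous, encode, k: int = 1) -> List[float]:
    """Reward each current-episode state by its k-NN distance to the previous episode."""
    prev_codes = [encode(s) for s in previous]
    return [knn_distance(encode(s), prev_codes, k) for s in current]


encode = make_random_encoder(in_dim=2, out_dim=4)
previous_episode = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1]]
current_episode = [[0.0, 0.0], [5.0, 5.0]]  # one revisited state, one far-away state
rewards = intrinsic_rewards(current_episode, previous_episode, encode)
print(rewards)
```

The revisited state receives zero reward and the far-away state a positive one, so the agent is pushed toward states the previous episode did not cover. The encoder is never trained, which is what keeps this kind of estimate cheap.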


GPPF: A General Perception Pre-training Framework via Sparsely Activated Multi-Task Learning

Benyuan Sun , Jin Dai , Zihao Liang , Congying Liu , Yi Yang , Bo Bai

Category: Computer Vision

2022-08-03

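Pre-training on mixed multi-task, multi-domain, and multi-modal data remains an open challenge in vision perception pre-training. In this paper, we propose GPPF, a General Perception Pre-training Framework, which pre-trains a task-level dynamic network, composed of knowledge "legos" in each layer, on labeled multi-task and multi-domain datasets. By inspecting humans' innate ability to learn in complex environments, we identify three critical elements and transfer them to deep networks: (1) simultaneous exposure to diverse cross-task and cross-domain information in each batch; (2) partitioned knowledge storage in separate lego units, driven by knowledge sharing; (3) sparse activation of a subset of lego units for both pre-training and downstream tasks. Notably, the joint training of disparate vision tasks is non-trivial because of their differences in input shapes, loss functions, output formats, data distributions, and so on. We therefore develop a plug-and-play multi-task training algorithm that supports Single Iteration Multiple Tasks (SIMT) concurrent training. SIMT lays the foundation for pre-training with large-scale multi-task, multi-domain datasets and proves essential for stable training in our GPPF experiments. Exhaustive experiments show that our GPPF-R50 model achieves significant improvements over a strong baseline on the 8 pre-training tasks in GPPF-15M and obtains a range of SOTA results on 22 downstream tasks with a similar computation budget. We also validate the generalization ability of GPPF to SOTA vision transformers, with consistent improvements. These solid experimental results demonstrate the effective knowledge learning, storing, sharing, and transferring provided by the GPPF framework.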


Efficient Private SCO for Heavy-Tailed Data via Clipping

Chenhan Jin , Kaiwen Zhou , Bo Han , James Cheng , Ming-Chang Yang

Category: Machine Learning

2022-06-27

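We consider stochastic convex optimization for heavy-tailed data with the guarantee of being differentially private (DP). Prior work on this problem is restricted to the gradient descent (GD) method, which is inefficient for large-scale problems. In this paper, we resolve this issue and derive the first high-probability bounds for a private stochastic method with clipping. For general convex problems, we derive excess population risks of $\tilde{O}\left(\frac{d^{1/7}\sqrt{\ln\frac{(n\epsilon)^2}{\beta d}}}{(n\epsilon)^{2/7}}\right)$ and $\tilde{O}\left(\frac{d^{1/7}\ln\frac{(n\epsilon)^2}{\beta d}}{(n\epsilon)^{2/7}}\right)$ under the bounded and unbounded domain assumptions, respectively (here $n$ is the sample size, $d$ is the dimension of the data, $\beta$ is the confidence level, and $\epsilon$ is the privacy level). We then extend our analysis to the strongly convex case and the non-smooth case (which works for generalized smooth objectives with Hölder-continuous gradients). We establish new excess risk bounds without the bounded domain assumption. The results above achieve lower excess risks and gradient complexities than existing methods in their corresponding cases. Numerical experiments are conducted to justify the theoretical improvement.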
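
Why clipping matters for heavy-tailed data can be seen in a generic "clip, average, add noise" gradient step, sketched below. This is the common pattern used in private stochastic optimization in general, not the specific algorithm or noise calibration of this paper. The function names are hypothetical, and `sigma` is a free parameter that is not calibrated to any privacy level $\epsilon$ here.

```python
import math
import random
from typing import List

Vector = List[float]


def clip(grad: Vector, threshold: float) -> Vector:
    """Rescale `grad` so that its Euclidean norm is at most `threshold`."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm <= threshold:
        return list(grad)
    scale = threshold / norm
    return [g * scale for g in grad]


def noisy_clipped_step(w: Vector, per_sample_grads: List[Vector], lr: float,
                       threshold: float, sigma: float, rng: random.Random) -> Vector:
    """One step: average the clipped per-sample gradients, add Gaussian noise, descend."""
    clipped = [clip(g, threshold) for g in per_sample_grads]
    mean = [sum(g[i] for g in clipped) / len(clipped) for i in range(len(w))]
    noisy = [m + rng.gauss(0.0, sigma) for m in mean]
    return [wi - lr * gi for wi, gi in zip(w, noisy)]


rng = random.Random(0)
heavy_tailed_grads = [[0.3, 0.4], [300.0, -400.0]]  # the second sample is an outlier
clipped_outlier = clip(heavy_tailed_grads[1], threshold=1.0)
# sigma=0.0 only makes this demo deterministic; a private method needs sigma > 0.
w_next = noisy_clipped_step([0.0, 0.0], heavy_tailed_grads, lr=0.1,
                            threshold=1.0, sigma=0.0, rng=rng)
print(clipped_outlier, w_next)
```

Clipping bounds the influence of any single sample by `threshold`, so the outlier with norm 500 contributes no more than a gradient of norm 1. That bound is what lets a fixed noise scale mask each individual contribution even when the raw gradients are heavy-tailed.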


Beyond Greedy Search: Tracking by Multi-Agent Reinforcement Learning-based Beam Search

Xiao Wang , Zhe Chen , Bo Jiang , Jin Tang , Bin Luo , Dacheng Tao

Category: Computer Vision | Artificial Intelligence | Machine Learning

2022-05-19

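To track a target in a video, current visual trackers usually adopt greedy search for target object localization in each frame; that is, the candidate region with the maximum response score is selected as the tracking result of each frame. However, we find that this may not be an optimal choice, especially in challenging tracking scenarios such as heavy occlusion and fast motion. To address this issue, we propose to maintain multiple tracking trajectories and apply a beam search strategy to visual tracking, so that the trajectory with fewer accumulated errors can be identified. Accordingly, this paper introduces a novel multi-agent reinforcement learning-based beam search tracking strategy, termed BeamTracking. It is mainly inspired by the image captioning task, which takes an image as input and generates diverse descriptions using the beam search algorithm. We therefore formulate tracking as a sample selection problem carried out by multiple parallel decision-making processes, each of which aims to pick one sample as its tracking result in each frame. Each maintained trajectory is associated with an agent that performs the decision-making and determines what actions should be taken to update the related information. When all frames have been processed, we select the trajectory with the maximum accumulated score as the tracking result. Extensive experiments on seven popular tracking benchmark datasets validate the effectiveness of the proposed algorithm.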
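
The difference between greedy search and beam search over trajectories can be illustrated with a small sketch. It is plain beam search with a hand-written score, not the paper's multi-agent reinforcement learning method: the candidate format, the `jump_penalty` term that penalizes large position jumps between frames, and the toy frames are all assumptions.

```python
from typing import List, Optional, Tuple

Candidate = Tuple[float, float]              # (position, response score)
Trajectory = Tuple[float, List[Candidate]]   # (accumulated score, chosen candidates)


def step_score(prev: Optional[Candidate], cand: Candidate, jump_penalty: float) -> float:
    """Response score, reduced by a penalty for jumping far from the previous position."""
    position, response = cand
    if prev is None:
        return response
    return response - jump_penalty * abs(position - prev[0])


def beam_track(frames: List[List[Candidate]], beam_width: int,
               jump_penalty: float = 1.0) -> Trajectory:
    """Keep the `beam_width` best trajectories per frame; return the best one at the end."""
    beams: List[Trajectory] = [(0.0, [])]
    for candidates in frames:
        expanded: List[Trajectory] = []
        for total, path in beams:
            prev = path[-1] if path else None
            for cand in candidates:
                expanded.append((total + step_score(prev, cand, jump_penalty), path + [cand]))
        expanded.sort(key=lambda item: item[0], reverse=True)
        beams = expanded[:beam_width]
    return beams[0]


frames = [
    [(0.0, 0.9), (10.0, 1.0)],   # a distractor at position 10 scores slightly higher
    [(1.0, 0.9), (10.0, 0.1)],   # the distractor fades
    [(2.0, 0.9)],                # only the true target remains
]
greedy_score, greedy_path = beam_track(frames, beam_width=1)
beam_score, beam_path = beam_track(frames, beam_width=2)
greedy_positions = [c[0] for c in greedy_path]
beam_positions = [c[0] for c in beam_path]
print(greedy_positions, beam_positions)
```

With `beam_width=1` the search is greedy and locks onto the distractor in the first frame. With `beam_width=2` the slightly lower-scoring true target survives as a second trajectory and ends with the higher accumulated score.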


Obtaining Dyadic Fairness by Optimal Transport

Moyi Yang , Junjie Sheng , Xiangfeng Wang , Wenyan Liu , Bo Jin , Jun Wang , Hongyuan Zha

Category: Machine Learning | Artificial Intelligence

2022-02-09

Fairness is regarded as a critical metric for machine learning models and as an important component of trustworthy machine learning. In this paper, we focus on obtaining fairness for popular link prediction tasks, where fairness is measured by dyadic fairness. A novel pre-processing methodology is proposed to establish dyadic fairness through data repairing based on optimal transport theory. Using the well-established theoretical connection between dyadic fairness for graph link prediction and a conditional distribution alignment problem, the dyadic repairing scheme can be equivalently transformed into a conditional distribution alignment problem. Furthermore, an optimal transport-based dyadic fairness algorithm called DyadicOT is obtained by efficiently solving the alignment problem while satisfying flexibility and unambiguity requirements. On two benchmark graph datasets, the proposed DyadicOT algorithm obtains better fairness than other fairness methods.


Knowledge Graph Augmented Network Towards Multiview Representation Learning for Aspect-based Sentiment Analysis

Qihuang Zhong , Liang Ding , Juhua Liu , Bo Du , Hua Jin , Dacheng Tao

Category: Natural Language Processing | Artificial Intelligence

2022-01-13

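Aspect-based sentiment analysis (ABSA) is a fine-grained sentiment analysis task. To better comprehend long, complicated sentences and obtain accurate aspect-specific information, this task generally requires linguistic and commonsense knowledge. However, most methods employ complicated and inefficient approaches to incorporate external knowledge, e.g., directly searching the graph nodes. In addition, the complementarity between external knowledge and linguistic information has not been thoroughly studied. To this end, we propose a knowledge graph augmented network (KGAN), which aims to effectively incorporate external knowledge together with explicit syntactic and contextual information. In particular, KGAN captures sentiment feature representations from multiple perspectives, i.e., context-, syntax-, and knowledge-based. First, KGAN learns the contextual and syntactic representations in parallel to fully extract the semantic features. Then, KGAN integrates the knowledge graphs into the embedding space, based on which the aspect-specific knowledge representations are further obtained via an attention mechanism. Last, we propose a hierarchical fusion module to complement these multi-view representations in a local-to-global manner. Extensive experiments on three popular ABSA benchmarks demonstrate the effectiveness and robustness of KGAN. Notably, with the help of the pre-trained RoBERTa model, KGAN sets a new state-of-the-art performance record.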
